1 Introduction

This tutorial summarizes details of the Joint Graphical Lasso (JGL) algorithm (Danaher et. al 2011), then walks through the steps needed to run the JGL package using a real-world metabolomics dataset.

1.1 Important Notes

Code throughout the document is “hidden” but if you click on “CODE” in the upper right above any R output, you can expand it to view code and (extensive) comments!

All code is available on my GitHub. Please see this link for Ariane Stark’s companion presentation which contains further elaboration on methods and results presented in the Danaher paper.

To run this tutorial on your computer please either clone the entire repository to your computer, double click jgl-torial.Rproj, open the .Rmd file and run the code as is after making sure you have installed the appropriate packages in R (below).

You can alternatively download the .csv and .Rmd files here and run the script after installing appropriate packages (below, first code chunk).

# if you are missing any of the packages please install them before proceeding:

#install.packages("JGL")
#install.packages("igraph", "RColorBrewer", "pheatmap")
#install.packages("tidyverse", "here")

library(JGL)
library(igraph)
library(tidyverse)
library(igraph)
library(RColorBrewer)
library(pheatmap)
#source(here::here("helper_functions.r")) #contains functions I made using code from Kate and Yukun and one function from Koyeje lab GitHub for tuning
theme_set(theme_bw()) #setting ggplot theme 

# read in metabolomics data from Raji, which is on the github for this tutorial
#http://github.com/mljaniczek/jgl-tutorial
dat <- read.csv("hapo_metabolomics_2020.csv")
# this is a funciton I made using code received from Kate Shutta and Yukun Li.
# Made this wrapper function to make the code within the tutorial document shorter.
plot_jgl <- function(
  thisGraph,
  multiplier = 15,
  vertex.size = 4,  
  vertex.label.dist = 2 , 
  vertex.label.cex = 0.5, 
  rescale = F) {
  myLayout =  layout_in_circle(makePrettyGraphFromGraph(thisGraph))*0.8
  lab.locs <- radian.rescale(x=1:length(V(thisGraph)), direction=-1, start=0)
  plot(makePrettyGraphFromGraph(thisGraph, multiplier = multiplier), 
       vertex.size = vertex.size,  
       vertex.label.dist = vertex.label.dist , 
       vertex.label.cex = vertex.label.cex, 
       vertex.label.color=V(thisGraph)$color, 
       layout = myLayout,
       lab.locs = lab.locs,
       vertex.label.degree=lab.locs,
       rescale=F)
}

# this is a function for making red-blue edges to match the edge weights(partial correlation)
# you may need to adjust the multiplier depending on how strong your edges are
# function from Kate and Yukun
makePrettyGraphFromGraph = function(thisGraph, multiplier = 15,redblue=T)
{
  if(redblue == T) my_palette <- brewer.pal(n = 8, name = "RdBu")
  if(redblue == F) my_palette <- brewer.pal(n = 8, name = "PRGn")
  my_palette <- my_palette[8:1]
  E(thisGraph)$width = abs(E(thisGraph)$weight)*multiplier
  E(thisGraph)$color = ifelse(sign(E(thisGraph)$weight)>0,my_palette[1],my_palette[8])
  return(thisGraph)
}


# this is code from this stack exchange: 
# https://stackoverflow.com/questions/23209802/placing-vertex-label-outside-a-circular-layout-in-igraph

radian.rescale <- function(x, start=0, direction=1) {
  c.rotate <- function(x) (x + start) %% (2 * pi) * direction
  c.rotate(scales::rescale(x, c(0, 2 * pi), range(x)))
}

# this function from Koyoje Lab Github
vBIC <- function(X, est_graph, thr=0.001){
  num <- length(X)
  BIC_acc <- 0.
  for(i in 1:num){
    
    data_num <- dim(X[[i]])[1]
    sample_cov <- cov(X[[i]], X[[i]])
    tr_sum <- sum(diag(sample_cov %*% est_graph$theta[[i]]))
    
    log_det <- determinant(est_graph$theta[[i]], logarithm = TRUE)$modulus[1][1]
    
    E <- sum(sum(abs(est_graph$theta[[i]]) >= thr))
    BIC_acc <- BIC_acc + (tr_sum - log_det) + (log(data_num)*E/data_num)
  }
  return(BIC_acc)
}

1.2 Acknowledgements

Thank you to Raji Balasubramanian for the metabolomics data, which was previously presented at ENAR 2021.

Thank you to Kate Shutta and Yukun Li who provided code for the beautiful circular network graphs.

2 Joint Graphical Lasso Overview

2.1 Background

  • We previously discussed details of graphical lasso to estimate precision matrix for Gaussian Graphical Models (GGMs) as in Yuan and Lin (2007) and Friedman et al (2007).

  • Joint graphical lasso builds upon this by estimating multiple, related GGMs from data with observations belonging to distinct classes (for example, cancer vs normal tissue).

  • There are two types of penalty functions we can use within JGL which will adjust how much information to leverage across the classes.

  • Additionally, the algorithm uses a fast alternating directions method of multipliers.

  • Danaher paper has almost 800 citations as of March 2022.

2.2 Notation

  • \(K\) number of classes (\(\geq 2\)). Index classes using \(k\) = 1, … \(K\).
  • \(\Sigma^{-1}_k\): True precision matrix for the kth class
  • \(Y^{(k)}\): \(n_k\) x \(p\) matrix consisting of \(n_k\) observations from the \(k\)th class on a set of \(p\) features which are common to all \(K\) datasets
  • \(S^{(k)}\): Empirical covariance matrix for \(Y^{(k)}\)
  • \(\Theta^{(k)}\): argument to convex optimization problem used for estimating \(\Sigma^{-1}_k\)
  • Index matrix arguments by using \(i\) = 1, …, \(p\) and \(j\) = 1, …, \(p\)
  • \(\lambda_1\) and \(\lambda_2\): non-negative tuning parameters used in penalty function

2.3 Major assumptions

  • We assume the observations within each class are iid.
  • Also assume \(\mu_k\), the mean for each class, is 0. i.e:

\[Y^{(k)}_1, ..., Y^{(k)}_{nk} \sim N(0, \Sigma_k)\]

2.4 Optimization problem

  • Our goal is to estimate \(\Sigma^{-1}_1\), …, \(\Sigma^{-1}_K\) by using penalized log-likelihood approach.

  • Seek {\(\hat{\Theta}\)} by solving:

\[maximize_{\{\Theta\}}\left(\sum^{K}_{k=1}n_k[log\{det(\theta^{(k)}\}-tr(S^{(k)}\Theta^{(k)}) - P(\{\Theta\})\right)\]

  • A major innovation of this paper, is the generalization of the optimization problem to multiple classes, in addition to using the penalty function \(P(\{\Theta\})\), for which the authors provide two different versions.

2.5 Penalty functions

2.5.1 Fused Graphical Lasso

  • Fused Graphical Lasso (FGL) uses the following penalty function:

\[P(\{\Theta\}) = \lambda_1\sum^K_{k=1}\sum_{i \neq j}|\theta^{(k)}_{ij}| + \lambda_2\sum_{k<k'}\sum_{i,j}|\theta^{(k)}_{ij}-\theta^{(k')}_{ij}|\]

  • When \(\lambda_1\) is large, FGL makes sparse estimates of \(\hat{\Theta}^{(1)}, ... , \hat{\Theta}^{(K)}\)
  • When \(\lambda_2\) is large, many elements of \(\hat{\Theta}^{(1)}, ... , \hat{\Theta}^{(K)}\) will be the same across classes
  • So, FGL “borrows information aggressively across classes, encouraging similar network structure and similar edge values”

2.5.2 Group Graphical Lasso

  • Group Graphical Lasso (GGL) uses the following penalty function:

\[P(\{\Theta\}) = \lambda_1\sum^K_{k=1}\sum_{i \neq j}|\theta^{(k)}_{ij}| + \lambda_2\sum_{i \neq j}\left(\sum_{i,j}{\theta^{(k)}_{ij}}^2\right)^{1/2}\]

  • Lasso penalty applied to elements of the precision matrices
  • Group lasso penalty is applied to the (i, j) element across all K precision matrices
  • When \(\lambda_1\) is large, GGL makes sparse estimates of \(\hat{\Theta}^{(1)}, ... , \hat{\Theta}^{(K)}\)
  • So, GGL just encourages a shared pattern of sparsity, not shared edge values (unlike FGL which encourage sharing across both)

3 Application of JGL to metabolomics dataset

3.1 Data Processing

Data comes from Hyperglycemia and Adverse Pregnancy Outcome (HAPO) Study. Metabolites were measured from blood samples taken from pregnant women from four ancestry groups: Afro-Caribbean, Mexican American, Northern European, and Thai.

The data consists of ID variable, ancestry group, and metabolites (p = 51 features). The ancestry groups (K = 4) are balanced (n = 400 each, with 1600 total observations).There is also a variable “fpg” which is fasting plasma glucose, which I omit from this analysis.

Within the data there are three groups of metabolites: Amino Acids, Acycl carnitines, and Other.

# examine ancestry group variable
summary(as.factor(dat$anc_gp))
## ag1 ag2 ag3 ag4 
## 400 400 400 400

3.1.1 Center and Scale Data within Class

We can see that the raw data has different distributions, but assumptions of JGL need data to be centered and scaled.

head(dat)[,1:10]
##       id anc_gp  fpg    mt1_1     mt1_2    mt1_3    mt1_4     mt1_5    mt1_6
## 1 hm0001    ag3 75.6 218.2223  76.99525 19.06366 14.23091  86.75162 135.2109
## 2 hm0002    ag3 84.6 292.6314 136.41320 43.14854 17.77549 120.17344 213.6531
## 3 hm0003    ag4 79.2 361.1135  79.98370 22.15848 13.05497  74.75441 136.1587
## 4 hm0004    ag3 73.8 274.0428  79.76154 19.72682 12.68744  69.34854 146.7262
## 5 hm0005    ag1 91.8 270.9049  88.98260 39.20949 15.73498  95.56857 145.2876
## 6 hm0006    ag4 73.8 271.4034  99.63992 27.81137 20.91344  95.55926 223.7438
##      mt1_7
## 1 64.00578
## 2 91.30156
## 3 83.67878
## 4 72.67280
## 5 48.49431
## 6 68.94172
summary(dat$mt1_1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   115.6   281.8   322.4   328.5   370.6   660.4
summary(dat$mt2_1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -4.631  -3.341  -3.040  -3.044  -2.750  -1.154       1
summary(dat$mt3_1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   17.28   21.39   22.64   22.96   24.64   27.25      11

Center and scale each metabolite feature, to satisfy assumptions of JGL.

# prepare data into list of K datasets as needed as input to JGL function

# make data into long format then center and scale by metabolite

## we do not want to center and scale within each group because that would remove any signal we see, right? 

dat_long <- dat %>%
  select(-fpg) %>%
  pivot_longer(cols = starts_with("mt"),
               names_to = "metabolite",
               values_to = "value",
               values_drop_na = TRUE) %>%
  group_by(metabolite) %>%
  mutate(scaled_value = scale(value, center = TRUE, scale = TRUE)) %>%
  ungroup()

head(dat_long)
## # A tibble: 6 × 5
##   id     anc_gp metabolite value scaled_value[,1]
##   <chr>  <chr>  <chr>      <dbl>            <dbl>
## 1 hm0001 ag3    mt1_1      218.            -1.66 
## 2 hm0001 ag3    mt1_2       77.0           -0.942
## 3 hm0001 ag3    mt1_3       19.1           -1.30 
## 4 hm0001 ag3    mt1_4       14.2           -0.744
## 5 hm0001 ag3    mt1_5       86.8           -0.591
## 6 hm0001 ag3    mt1_6      135.            -1.17
# if you want to look at one metabolite
# dat_long %>% 
#   filter(metabolite %in% c("mt1_1")) %>%
#   ggplot(aes(x=value, color=anc_gp)) +
#     geom_density()+
#   facet_grid(metabolite ~.)

# demonstrative density plot showing centering/scaling of data
dat_long %>% 
  filter(metabolite %in% c("mt1_1", "mt2_1", "mt3_1")) %>%
  ggplot(aes(x=scaled_value, color=anc_gp)) +
    geom_density()+
  facet_grid(metabolite ~.)

3.1.2 Make K n by p matrices

Now the raw data is centered and scaled! But first we need to get it into K n by p matrices as required by the JGL package.

So in our case, we need 4 matrices (4 ancestry groups), each with 400 observations of 51 metabolites.

Note: Normally we would impute any missing data. For simplicity in this code I treated missing data as “0” value.

# make list of ancestry groups
ancestry_groups <- sort(unique(dat$anc_gp))

# filter by ancestry group, then make wide matrix of scaled values
# create a list of the results
# Here I'm using the `purrr::map` to iterate over the 4 ancestry groups and make the list. You could use base R lapply instead if you prefer.
dat_mat_list <- purrr::map(ancestry_groups,
                    ~dat_long %>%
                      filter(anc_gp == .x) %>%
                      select(-c(anc_gp, value)) %>%
                      pivot_wider(names_from = metabolite, 
                                  values_from = scaled_value,
                                  #for simplicity putting 0 as missing values. Normally should do imputation.
                                  values_fill = 0) %>%
                      select(-id) %>%
                      as.matrix())

names(dat_mat_list) <- ancestry_groups

Let’s check to see if our list contains the right structure:

str(dat_mat_list)
## List of 4
##  $ ag1: num [1:400, 1:51] -0.869 -0.474 -0.893 1.082 0.578 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##  $ ag2: num [1:400, 1:51] 0.169 0.698 -1.174 -1.012 -0.591 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##  $ ag3: num [1:400, 1:51] -1.664 -0.541 -0.822 0.152 -0.591 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##  $ ag4: num [1:400, 1:51] 0.4919 -0.8616 0.5032 0.0342 -0.157 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...

It does! So now we have a list of 4 matrices, each with 400 standardized observations of 51 metabolites.

3.1.3 Heatmap to visualize the processed data

Before we proceed, let’s visualize the processed data quick in a heatmap to get an idea of if the data has any patterns.

# making one full wide data frame with all 1600 observations, using the centered-scaled values for metabolites that we processed above. 
met_wide_all <- dat_long %>%
  select(-value) %>%
  pivot_wider(names_from = metabolite, 
              values_from = scaled_value,
              #for simplicity putting 0 as missing values. Normally should do imputation.
              values_fill = 0) %>%
  arrange(anc_gp) #arranging by ancestry group for the heatmap

#make matrix out of dataframe, removing the id and group labels since pheatmap just takes numeric matrix as input
met_wide_all_mat <- t(as.matrix(met_wide_all%>%
  select(-c(id, anc_gp))))
met_wide_all_mat[is.na(met_wide_all_mat)] <- 0
colnames(met_wide_all_mat) <- met_wide_all$id

#pheatmap can annotate your heatmap nicely with colors per class! so I'm preparing those here. 

#first I want the columns annotated by ancestry group
my_sample_col <- data.frame(ancestry = as.factor(met_wide_all$anc_gp))
rownames(my_sample_col) <- met_wide_all$id
# then I want the rows annotated by metabolite group, we have three groups of metabolites mt1, mt2, mt3
met_group <- str_split_fixed(rownames(met_wide_all_mat), "_", 2)[,1] %>%
  str_replace_all(c("mt1" = "Amino Acids",
                  "mt2" = "Acyl carnitines",
                  "mt3" = "Other"))
my_sample_row <- data.frame(met_group = met_group)
row.names(my_sample_row) <- row.names(met_wide_all_mat)

# finally we input the numeric matrix and annotations
pheatmap(met_wide_all_mat, 
         cluster_cols = FALSE, 
         show_colnames = FALSE, 
         show_rownames = FALSE, 
         annotation_col = my_sample_col,
         annotation_row = my_sample_row)

Here the x axis has the 1600 samples, sorted by ancestry group. The y axis has the 51 metabolites, clustered using hierarchical clustering, and with a color label based on metabolite group (there are 3 “groups” in the data, with suffix mt1, mt2, and mt3 which indicate Amino Acids, Acyl carnitines, and other groups respectively).

From this heatmap we can see there are big differences in the values for “Other” metabolite group (pink) across the 4 ancestry groups. metabolites from Amino Acids and Acyl Carnitines groups appear to be more homogeneous across the classes.

Let’s get to joint graphical lasso now! We will essentially be taking the data from each ancestry section and fitting graphical lasso, while also borrowing strength from the other classes which is a strength of the JGL.

3.2 Apply Joint Graphical Lasso

Danaher et. al wrote the R package JGL as companion to their paper. There are two penalty functions for lambda2 described in the paper: Fused Graphical Lasso (FGL) and Group Graphical Lasso (GGL) penalties.

We will first go over some basic results using preset values for lambda1 and lambda2 parameters, visualize the results, then discuss strategies for tuning the parameters.

3.2.1 Using Fused Graphical Lasso penalty

We use the JGL() function from the JGL package, with penalty = "fused" specified.

For now we are just putting in a preselected value for lambda1 and lambda2, but in a later section we will go over methods to tune the parameters.

fgl_results = JGL(Y = dat_mat_list,
                  penalty = "fused",
                  lambda1 = .15,
                  lambda2 = 0.2,
                  return.whole.theta = FALSE)

str(fgl_results) # the theta contains a list of estimated matrices, one for each of the K classes. We will extract the thetas for visualization with igraph. 
## List of 3
##  $ theta            :List of 4
##   ..$ : num [1:39, 1:39] 1.46 0 0 0 0 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:39] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   .. .. ..$ : chr [1:39] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   ..$ : num [1:39, 1:39] 1.46 0 0 0 0 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:39] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   .. .. ..$ : chr [1:39] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   ..$ : num [1:39, 1:39] 1.46 0 0 0 0 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:39] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   .. .. ..$ : chr [1:39] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   ..$ : num [1:39, 1:39] 1.46 0 0 0 0 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:39] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   .. .. ..$ : chr [1:39] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##  $ theta.unconnected:List of 4
##   ..$ : Named num [1:12] 13.98 16.69 7.94 4.81 8.27 ...
##   .. ..- attr(*, "names")= chr [1:12] "mt3_2" "mt3_3" "mt3_4" "mt3_6" ...
##   ..$ : Named num [1:12] 13.98 16.69 7.94 4.81 8.27 ...
##   .. ..- attr(*, "names")= chr [1:12] "mt3_2" "mt3_3" "mt3_4" "mt3_6" ...
##   ..$ : Named num [1:12] 13.98 16.69 7.94 4.81 8.27 ...
##   .. ..- attr(*, "names")= chr [1:12] "mt3_2" "mt3_3" "mt3_4" "mt3_6" ...
##   ..$ : Named num [1:12] 13.98 16.69 7.94 4.81 8.27 ...
##   .. ..- attr(*, "names")= chr [1:12] "mt3_2" "mt3_3" "mt3_4" "mt3_6" ...
##  $ connected        : logi [1:51] TRUE TRUE TRUE TRUE TRUE TRUE ...
##  - attr(*, "class")= chr "jgl"
print.jgl(fgl_results)
## 
## Number of connected nodes:  39 
## Number of subnetworks in each class:  1 1 1 1 
## Number of edges in each class:  117 117 117 117 
## Number of edges shared by all classes:  117 
## L1 norm of off-diagonal elements of classes' Thetas:  41.60834 41.60834 41.60834 41.60834

The basic code itself is simple, although the package does not come with easy-to-use visualization functions. The following contains a frankenstein mix of code I borrowed from online and from colleagues (thank you Kate and Yukun again!!).

  • Extract the estimated covariance matrices from the JGL() result
  • Use graph_from_adjacency_matrix() function from the igraph package on each of the K estimated covariance matrices
  • Either use plot.igraph() and tweak inputs to make a nice graph, or you can use the cobbled-together code that was passed down from Kate to Yukun to me, each of us has tweeked it slightly to how we like it! I have made the function plot_jgl() in the companion script visualize.r which is available in my GitHub along with this presentation.

By examining the below graphs we can see that the estimated networks, especially for the strong correlations, is largely similar across the four groups.

# extract all estimated covariance matrices from result
inv_covar_matrices <- fgl_results$theta
names(inv_covar_matrices) <- ancestry_groups

#now use `graph_from_adjacency_matrix()` function from igraph to create igraph graphs from adjacency matrices

#again since we have a list covariance matrices I use the map() function to apply igraph::graph_from_adjacency_matrix() over the list

graph_list <- map(ancestry_groups,
                  ~graph_from_adjacency_matrix(
                    -cov2cor(inv_covar_matrices[[.x]]),
                    weighted = T,
                    mode = "undirected",
                    diag = FALSE
                  ))
names(graph_list) <- ancestry_groups

# can use the basic igraph code to plot
# plot.igraph(graph_list[["ag1"]],
#             layout = layout.fruchterman.reingold)

# or can use funcion I made from Kate/Yukun's code to make a pretty circular graph
#plot_jgl(graph_list[[1]], multiplier = 3)

3.2.1.1 FGL Class 1

plot_jgl(graph_list[[1]], multiplier = 3)

3.2.1.2 FGL Class 2

plot_jgl(graph_list[[2]], multiplier = 3)

3.2.1.3 FGL Class 3

plot_jgl(graph_list[[3]], multiplier = 3)

3.2.1.4 FGL Class 4

plot_jgl(graph_list[[4]], multiplier = 3)

3.2.2 Using Group Graphical Lasso penalty

Now run “group” penalty instead in JGL() function. Same procedure as above except with penalty = "group" as input.

## run ggl:
ggl_results = JGL(Y=dat_mat_list,
                  penalty="group",
                  lambda1=.15,
                  lambda2=.2,
                  return.whole.theta=TRUE)
str(ggl_results)
## List of 2
##  $ theta    :List of 4
##   ..$ : num [1:51, 1:51] 1.52 0 0 0 0 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   .. .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   ..$ : num [1:51, 1:51] 1.03 0 0 0 0 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   .. .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   ..$ : num [1:51, 1:51] 2.06 0 0 0 0 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   .. .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   ..$ : num [1:51, 1:51] 1.12 0 0 0 0 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##   .. .. ..$ : chr [1:51] "mt1_1" "mt1_2" "mt1_3" "mt1_4" ...
##  $ connected: logi [1:51] TRUE TRUE TRUE FALSE TRUE TRUE ...
##  - attr(*, "class")= chr "jgl"
print.jgl(ggl_results)
## 
## Number of connected nodes:  31 
## Number of subnetworks in each class:  1 1 1 2 
## Number of edges in each class:  94 95 92 92 
## Number of edges shared by all classes:  84 
## L1 norm of off-diagonal elements of classes' Thetas:  26.26214 21.82969 24.76903 23.12329

Extract estimated covariance matrices and prepare for visualization.

# extract all estimated covariance matrices from result
ggl_inv_covar_matrices <- ggl_results$theta
names(ggl_inv_covar_matrices) <- ancestry_groups

#now use function from igraph to create igraph graphs from adjacency matrices

ggl_graph_list <- map(ancestry_groups,
                  ~graph_from_adjacency_matrix(
                    -cov2cor(ggl_inv_covar_matrices[[.x]]),
                    weighted = T,
                    mode = "undirected",
                    diag = FALSE
                  ))
names(ggl_graph_list) <- ancestry_groups

3.2.2.1 GGL Class 1

plot_jgl(ggl_graph_list[[1]], multiplier = 3)

3.2.2.2 GGL Class 2

plot_jgl(ggl_graph_list[[2]], multiplier = 3)

3.2.2.3 GGL Class 3

plot_jgl(ggl_graph_list[[3]], multiplier = 3)

3.2.2.4 GGL Class 4

plot_jgl(ggl_graph_list[[4]], multiplier = 3)

BUT what we’ve done so far we’ve had to pre-select the lambdas.

Let’s show how we would input an assortment of lambdas and tune those parameters.

3.3 Tuning parameters

I found the Koyejo Lab GitHub repository full of code to assist with different methods to select parameters for JGL.

The below chunk is one example where they use the following process:

  • Do a 2D grid search over a sequence of possible values for lambda1 and lambda2, run JGL() for each
  • Collect Akaike Information Criterion (AIC) for each combination
  • Identify lambda combination with the lowest AIC
  • Built final model with tuned parameters.

You can use this method with both fused or group penalty.

3.3.1 Example tuning lambdas with metabolomics data using GGL

Below I’m making a very small 3x3 grid of lambdas for run-time demonstration purposes (it takes my computer less than a minute, but timing went up substantially when I added a larger grid).

interval_l = 3
lambda.eff <- seq(0.01, 0.3, len = interval_l)
aic_vec <- matrix(NA, length(lambda.eff),length(lambda.eff))

#search grid of lambdas
for(i in 1:length(lambda.eff)){
    for(j in 1:length(lambda.eff)){
        fit00 <- JGL(Y=dat_mat_list,penalty="group",lambda1=lambda.eff[i],lambda2=lambda.eff[j], return.whole.theta=TRUE)
        aic_vec[i,j] <- vBIC(dat_mat_list, fit00, thr=0.0001)
   }
}


min(aic_vec)
## [1] -11.69033
which(aic_vec == min(aic_vec), arr.ind = TRUE)
##      row col
## [1,]   1   2
image(x=lambda.eff,y=lambda.eff,z=aic_vec,
      ylab='lambda_2',xlab='lambda_1')
contour(x=lambda.eff,y=lambda.eff,z=aic_vec, add = TRUE,nlevels = 10)

Identify lambda combination with lowest AIC.

# identify which had min aic
i_idx <- which(aic_vec == min(aic_vec), arr.ind = TRUE)[1,1]
j_idx <- which(aic_vec == min(aic_vec), arr.ind = TRUE)[1,2]

#assign tuned lambdas based on your search
lam_1 <- lambda.eff[i_idx]
lam_2 <- lambda.eff[j_idx]

3.3.2 Run JGL again with tuned parameters

print(paste("lambda_1", lam_1))
## [1] "lambda_1 0.01"
print(paste("lambda_2", lam_2))
## [1] "lambda_2 0.155"
ggl_tuned_results <- JGL(Y=dat_mat_list,
                  penalty="group",
                  lambda1= lam_1,
                  lambda2= lam_2,
                  return.whole.theta=TRUE)

# extract all estimated covariance matrices from result
inv_covar_matrices <- ggl_tuned_results$theta
names(inv_covar_matrices) <- ancestry_groups

#now use `graph_from_adjacency_matrix()` function from igraph to create igraph graphs from adjacency matrices

#again since we have a list covariance matrices I use the map() function to apply igraph::graph_from_adjacency_matrix() over the list

graph_list <- map(ancestry_groups,
                  ~graph_from_adjacency_matrix(
                    -cov2cor(inv_covar_matrices[[.x]]),
                    weighted = T,
                    mode = "undirected",
                    diag = FALSE
                  ))
names(graph_list) <- ancestry_groups

3.3.2.1 Tuned GGL Class 1

plot_jgl(graph_list[[1]], multiplier = 3)

3.3.2.2 Tuned GGL Class 2

plot_jgl(graph_list[[2]], multiplier = 3)

3.3.2.3 Tuned GGL Class 3

plot_jgl(graph_list[[3]], multiplier = 3)

3.3.2.4 Tuned GGL Class 4

plot_jgl(graph_list[[4]], multiplier = 3)

4 References

Patrick Danaher, Pei Wang and Daniela Witten (2011). The joint graphical lasso for inverse covariance estimation across multiple classes. http://arxiv.org/abs/1111.0324